communication protocol
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility. In these environments, agents must learn communication protocols in order to share information that is needed to solve the tasks. By embracing deep neural networks, we are able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability. We propose two approaches for learning in these domains: Reinforced Inter-Agent Learning (RIAL) and Differentiable Inter-Agent Learning (DIAL). The former uses deep Q-learning, while the latter exploits the fact that, during learning, agents can backpropagate error derivatives through (noisy) communication channels. Hence, this approach uses centralised learning but decentralised execution. Our experiments introduce new environments for studying the learning of communication protocols and present a set of engineering innovations that are essential for success in these domains.
- North America > United States > California (0.04)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- (2 more...)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- North America > United States > California > Santa Clara County > Stanford (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
- (7 more...)
- North America > United States (0.15)
- Asia > Middle East > Jordan (0.04)
- North America > United States (0.15)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Europe > Spain (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (0.94)
- Information Technology > Data Science > Data Mining > Big Data (0.68)
- Information Technology > Communications > Networks (0.67)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Education (0.46)
- Government (0.46)
- Europe > United Kingdom > England > Greater London > London (0.05)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Africa > South Africa > Gauteng > Johannesburg (0.04)
Your Title
Consider a team of cooperative players that take actions in a networkedenvironment. At each turn, each player chooses an action and receives a reward that is an unknown function of all the players' actions. The goal of the team of players is to learn to play together the action profile that maximizes the sum of their rewards. However, players cannot observe the actions or rewards of other players, and can only get this information by communicating with their neighbors.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Middle East > Jordan (0.04)